The T-recs Approach for Table Structure Recognition and Table Border Determination

Authors

  • Thomas G. Kieninger
  • Andreas Dengel
Abstract

We present a snapshot of ongoing research in the field of table structure recognition and analysis. The prototypical T-Recs system (Table RECognition System) relies on the word-level layout (bounding box geometry) as its primary input. It also considers the textual information and any available delineations as further input. This article summarizes the basic ideas and system features described in [1] and [2] (downloadable from our demo page: www.dfki.uni-kl.de/ kieni/t recs/). The demo page also lets users interactively load and modify some predefined documents that demonstrate the main features and strengths of the approach. Next, we sketch some problems encountered when applying T-Recs to business letters such as offers and invoices, and discuss appropriate solutions.

1 SUMMARY OF CURRENT SYSTEM

Based on the word-level layout information, which can be derived either from OCR documents (e.g. the Xerox XDOC format) or from plain ASCII files by using a built-in preprocessor, the T-Recs system performs a direct bottom-up clustering of word segments into blocks (not the conventional word-line-block order!) and thus transforms the document representation from the word-level layout to a partial logical representation (see Figure 1).

[Figure 1. Labels: Table Structures, Block segments, Word segments, Structural Analysis, Segmentation, Logic]
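The bottom-up clustering of word segments into blocks can be illustrated with a minimal Python sketch. It is not the T-Recs implementation. It assumes that a block grows from a seed word to every word on a directly adjacent text line whose horizontal extent overlaps it, and it uses character columns of a plain ASCII file as a stand-in for bounding boxes. The names `Word`, `words_from_ascii`, `overlaps` and `cluster_blocks` are hypothetical and chosen only for this illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    line: int  # index of the text line (0-based)
    x0: int    # left edge as a character column (inclusive)
    x1: int    # right edge as a character column (exclusive)


def words_from_ascii(text):
    """Assumed stand-in for a preprocessor: one Word per whitespace-delimited token."""
    return [
        Word(m.group(), line_no, m.start(), m.end())
        for line_no, line in enumerate(text.splitlines())
        for m in re.finditer(r"\S+", line)
    ]


def overlaps(a, b):
    """True if the horizontal extents of two words intersect."""
    return a.x0 < b.x1 and b.x0 < a.x1


def cluster_blocks(words):
    """Grow blocks directly from words, without building text lines first."""
    by_line = {}
    for w in words:
        by_line.setdefault(w.line, []).append(w)

    unassigned = set(words)
    blocks = []
    for seed in words:
        if seed not in unassigned:
            continue  # already absorbed by an earlier block
        unassigned.discard(seed)
        block, stack = [], [seed]
        while stack:
            current = stack.pop()
            block.append(current)
            # Assumption: only the lines directly above and below are inspected.
            for adjacent in (current.line - 1, current.line + 1):
                for candidate in by_line.get(adjacent, []):
                    if candidate in unassigned and overlaps(current, candidate):
                        unassigned.discard(candidate)
                        stack.append(candidate)
        blocks.append(sorted(block, key=lambda w: (w.line, w.x0)))
    return blocks


sample = "Item      Price\nApple     1.20\nPear      0.80"
for block in cluster_blocks(words_from_ascii(sample)):
    print([w.text for w in block])
```

On this two-column sample the sketch yields one block per column. Because it never links words within the same line, a multi-word cell can split into several blocks unless a word on a neighbouring line bridges them. A complete system would need further steps for such cases, which this sketch leaves out.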

Similar resources

Structure Recognition Based On Robust

This paper presents an efficient approach to identify tabular structures within either electronic or paper documents. The resulting T-Recs system takes word bounding box information as input, and outputs the corresponding logical text block units (e.g. the cells within a table environment). Starting with an arbitrary word as block seed the algorithm recursively expands this block to all words tha...

Numerical and Experimental Study of Soil-structure Interaction in Structures Resting on Loose Soil Using Laminar Shear Box

In the present work, the effect of Soil-Structure Interaction (SSI) in low-frequency structures resting on loose soil has been studied through numerical modelling and shaking table tests. In the theoretical studies, two types of models, namely fixed-base and flexible-base structures, were subjected to three selected earthquake records. Nonlinear dynamic analysis was employed for all of the numerical ...

Computational determination of character table and symmetry of fullerenes cage as C24 and C28

Fullerene chemistry is nowadays a well-established field of both theoretical and experimental investigations. This study considers the symmetry of the small fullerene cages C24 and C28. Using the PM3 program for the C24 and C28 fullerenes, Oh and Td symmetry were confirmed, respectively. The mentioned algorithm to compute the automorphism group of these fullerenes with connectivity and geometry of th...

Performance evaluation of the factors influencing Tourists General Satisfaction in the Border Cities. Baneh Border City

According to a WTTC (World Travel and Tourism Council) forecast, tourism's contribution to global GDP will be about 6,000 billion dollars in 2020 and will create 300 million jobs. Tourism can therefore be considered a multidimensional field that responds to the needs of tourists with diverse interests and motivations. Shopping is one of the most essential needs and a popular activity for tourists. B...

T-RECS: Training for Rate-Invariant Embeddings by Controlling Speed for Action Recognition

An action should remain identifiable when modifying its speed: consider the contrast between an expert chef and a novice chef each chopping an onion. Here, we expect the novice chef to have a relatively measured and slow approach to chopping when compared to the expert. In general, the speed at which actions are performed, whether slower or faster than average, should not dictate how they are r...

Publication date: 1999